WIP to add dsi failover task #13

Draft
huayu-ouyang wants to merge 3 commits into main from SERVER-119445

@huayu-ouyang huayu-ouyang commented Apr 30, 2026

Main changes:

  • new TSBS flags to retry writes

  • new TSBS flag to add _id to every write so we can retry writes, including across a driver panic

  • There are some changes to potentially ignore fatal errors and still log a "summary of workload" line at the end, but I am probably going to remove those

I used AI a lot to debug and implement things here, as I'm not very familiar with the TSBS repo, so I would appreciate any feedback.


var spPool = &sync.Pool{New: func() interface{} { return &singlePoint{} }}

// makeDeterministicID builds a 12-byte _id from the identity fields of a

The code for the deterministic IDs was all written by AI, and I am not the most knowledgeable about this area, so let me know if there's a better way to do this.
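For reviewers, here is a minimal sketch of one way a deterministic 12-byte _id could be derived. It is not the code in this PR: the function name makeDeterministicIDSketch and the choice of identity fields (measurement, tags, timestamp) are assumptions made for illustration. It hashes the fields with SHA-256 and keeps the first 12 bytes, which is the size of a BSON ObjectID, so a retried write of the same point produces the same _id.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// makeDeterministicIDSketch is a hypothetical helper, not the PR's function.
// It hashes the (assumed) identity fields of a point and keeps the first
// 12 bytes of the digest, so equal inputs always yield an equal _id.
func makeDeterministicIDSketch(measurement, tags string, timestampNanos int64) [12]byte {
	h := sha256.New()
	var buf [8]byte

	// Length-prefix each string so ("ab", "c") and ("a", "bc") hash differently.
	for _, s := range []string{measurement, tags} {
		binary.BigEndian.PutUint64(buf[:], uint64(len(s)))
		h.Write(buf[:])
		h.Write([]byte(s))
	}

	// Append the timestamp as a fixed-width big-endian integer.
	binary.BigEndian.PutUint64(buf[:], uint64(timestampNanos))
	h.Write(buf[:])

	var id [12]byte
	copy(id[:], h.Sum(nil)) // truncate the 32-byte digest to 12 bytes
	return id
}

func main() {
	id := makeDeterministicIDSketch("cpu", "hostname=host_0", 1451606400000000000)
	fmt.Println(hex.EncodeToString(id[:]))
}
```

Truncating a hash to 96 bits makes collisions unlikely rather than impossible, so whether this is acceptable depends on how many points the workload inserts.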

func (t *mongoTarget) TargetSpecificFlags(flagPrefix string, flagSet *pflag.FlagSet) {
	flagSet.String(flagPrefix+"url", "mongodb://localhost:27017/", "Mongo URL.")
	flagSet.Duration(flagPrefix+"write-timeout", 10*time.Second, "Write timeout.")
	flagSet.Duration(flagPrefix+"socket-timeout", 10*time.Second, "Mongo client SocketTimeout. Bounds how long the driver waits on a single socket read/write before giving up.")

Basically, the intent was to separate the serverSelectionTimeout, the socketTimeout, and the writeTimeout into independent settings.
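As a rough illustration of keeping the three limits independent, the sketch below parses them into separate fields. It uses the standard library flag package rather than pflag so that it is self-contained. The write-timeout and socket-timeout names come from the diff; the server-selection-timeout flag name, its 30-second default, and the timeouts struct are assumptions.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// timeouts groups the three limits so that each can be tuned on its own.
// This struct is hypothetical and does not appear in the PR.
type timeouts struct {
	ServerSelection time.Duration // how long to wait for a usable server
	Socket          time.Duration // bound on a single socket read/write
	Write           time.Duration // bound on one whole write operation
}

// parseTimeouts reads the three durations from args; unset flags keep
// their defaults.
func parseTimeouts(args []string) (timeouts, error) {
	var t timeouts
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	// Assumed flag name and default; not taken from the diff.
	fs.DurationVar(&t.ServerSelection, "server-selection-timeout", 30*time.Second, "Server selection timeout.")
	fs.DurationVar(&t.Socket, "socket-timeout", 10*time.Second, "Socket timeout.")
	fs.DurationVar(&t.Write, "write-timeout", 10*time.Second, "Write timeout.")
	err := fs.Parse(args)
	return t, err
}

func main() {
	t, err := parseTimeouts([]string{"-socket-timeout=2s"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(t.ServerSelection, t.Socket, t.Write)
}
```

During a failover, a short socket timeout lets a dead connection fail fast while a longer server selection timeout gives the new primary time to be elected, which is one reason to avoid a single shared value.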


loader.RunBenchmark(benchmark)

// code for the not fatal logging workaround

might delete this

@@ -0,0 +1,38 @@
package main

This file exists only for the ignore-fatal-errors workaround.

// In both cases we return only the documents that still need to be (re)tried,
// and we drop docs whose per-write error is non-retryable (e.g. duplicate key
// from a prior partially-successful attempt).
func remainingDocsAfterPartialFailure(docs []interface{}, bwe mongo.BulkWriteException, ordered bool) []interface{} {

Since we're using InsertMany, this lets us retry only the writes that still need it if a failover occurs in the middle of a batch. Again, I wasn't sure whether there is a better way to do this.
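To make the ordered versus unordered distinction concrete, here is a simplified sketch of the selection logic. It does not use the driver types: writeErrorSketch is a hypothetical stand-in for a per-document write error (index into the batch plus server error code), and the retryable predicate is supplied by the caller. In an ordered insert the server stops at the first failure, so every later document was never attempted; in an unordered insert every document was attempted, so only the failed ones are candidates.

```go
package main

import "fmt"

// writeErrorSketch is a hypothetical stand-in for one per-document write
// error: the document's position in the batch and the server error code.
type writeErrorSketch struct {
	Index int
	Code  int
}

// duplicateKeyCode is MongoDB's duplicate key error code.
const duplicateKeyCode = 11000

// remainingDocsSketch returns the documents that still need to be retried.
// Failed documents are kept only if retryable(code) is true. For ordered
// inserts, documents after the first failure are kept because the server
// never attempted them.
func remainingDocsSketch(docs []interface{}, errs []writeErrorSketch, ordered bool, retryable func(code int) bool) []interface{} {
	failed := make(map[int]int, len(errs)) // batch index -> error code
	first := len(docs)                     // index of the earliest failure
	for _, e := range errs {
		if e.Index < 0 || e.Index >= len(docs) {
			continue // ignore indexes that do not refer to this batch
		}
		failed[e.Index] = e.Code
		if e.Index < first {
			first = e.Index
		}
	}

	var out []interface{}
	for i, d := range docs {
		code, didFail := failed[i]
		switch {
		case didFail:
			if retryable(code) {
				out = append(out, d)
			}
		case ordered && i > first:
			out = append(out, d) // skipped by the server after the first error
		}
	}
	return out
}

func main() {
	docs := []interface{}{"a", "b", "c", "d"}
	// Simplification for the example: everything except duplicate key is retryable.
	retryable := func(code int) bool { return code != duplicateKeyCode }
	errs := []writeErrorSketch{{Index: 1, Code: duplicateKeyCode}}
	fmt.Println(len(remainingDocsSketch(docs, errs, true, retryable)))
	fmt.Println(len(remainingDocsSketch(docs, errs, false, retryable)))
}
```

Dropping duplicate key failures is only safe when _id values are deterministic, because then a duplicate means an earlier attempt already stored that exact document.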

//
// We match on the runtime panic message specifically so that unrelated
// panics (nil-pointer in user code, OOM, etc.) are not treated as retryable.
func isKnownInsertManyResultSlicePanic(v interface{}) bool {

This is the Go driver panic that we want to retry on.
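The general pattern of recovering a panic and retrying only when it matches a known runtime message can be sketched as follows. The helper names and the matched substrings ("slice bounds out of range", "index out of range") are assumptions for illustration; the exact message the PR matches is not shown in this excerpt.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// isSliceBoundsPanicSketch reports whether a recovered panic value is a Go
// runtime error whose message looks like a slice or index bounds failure.
// The substrings are assumed; the PR may match a different message.
func isSliceBoundsPanicSketch(v interface{}) bool {
	re, ok := v.(runtime.Error)
	if !ok {
		return false // explicit panic("...") values and custom errors are not matched
	}
	msg := re.Error()
	return strings.Contains(msg, "slice bounds out of range") ||
		strings.Contains(msg, "index out of range")
}

// callRecovering runs fn and reports any panic as a value instead of
// letting it unwind further.
func callRecovering(fn func()) (recovered interface{}, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			recovered, panicked = r, true
		}
	}()
	fn()
	return nil, false
}

func main() {
	v, panicked := callRecovering(func() {
		s := make([]int, 3)
		n := 5
		_ = s[:n] // exceeds the slice capacity and panics at run time
	})
	fmt.Println(panicked, isSliceBoundsPanicSketch(v))
}
```

Checking for runtime.Error before comparing the message keeps unrelated panics, such as a nil map write or an explicit panic with a string, out of the retry path.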

@huayu-ouyang huayu-ouyang requested review from dhly-etc and kmahar April 30, 2026 16:35
return fmt.Sprintf("panic in mongo InsertMany: %v", e.value)
}

// isKnownInsertManyResultSlicePanic reports whether the recovered panic

Sorry, these (AI-generated) comments are a little verbose. I figured I would leave them for now, but I will probably rewrite them before merging.

@huayu-ouyang huayu-ouyang requested a review from seanzimm April 30, 2026 16:45